ABSTRACT


The intelligence-making process, often described as the intelligence cycle, consists of
several phases. Congestion may occur in phases that require time-consuming tasks such
as translation, processing and analysis. To improve the performance of those
time-consuming phases, a preliminary classification of intelligence items according to their
relevance and value to an intelligence request is performed. This classification is subject
to false positive and false negative errors, where an item is classified as positive if it is 
relevant and provides valuable information to an intelligence request, and negative 
otherwise. The tradeoff between both types of errors, represented visually by the 
Receiver Operating Characteristic curve, depends on the training and capabilities of the 
classifiers as well as the classification test performed on each item and the decision rule 
that separates between positives and negatives. 

An important question that arises is how to best tune the classification process 
such that both accuracy of the classification and its timeliness are adequately addressed. 
An analytic answer is presented via a novel optimization model based on a tandem queue 
model. 

This thesis provides decision makers in the intelligence community with measures 
of effectiveness and decision support tools for enhancing the effectiveness of the 
classification process in a given intelligence operations scenario. In addition to the 
analytic study, numerical results are presented to obtain quantitative insights via 
sensitivity analysis of input parameters. 
EXECUTIVE SUMMARY


Intelligence provides leaders with information and knowledge to support decision 
making. The process of producing intelligence is commonly described as a cycle of 
phases that begins with the issuing of an information request and ends with the 
dissemination of a coherent assessment to the relevant consumers. The new information 
leads to an update of the information requests and hence the cyclic nature of the process. 

Today’s vast usage of communication results in a glut of information, which may 
create bottlenecks in phases that require close attention to each information item such as 
translation, processing and analysis. Moreover, as the timeliness of the information 
becomes more crucial, the bottleneck’s effect becomes more critical. 

To ameliorate the performance of those time-consuming phases, a preliminary 
classification of items as to their relevance to an information request is performed. 
Nevertheless, this binary classification process requires additional resources such as 
personnel and time, and it is subject to false positive and false negative errors, where an 
item is classified as positive if it is relevant and provides additional information to an 
intelligence request and negative otherwise. The tradeoff between both types of errors is 
represented by a functional relationship called the Receiver Operating Characteristic 
(ROC) curve. 

The quality of the classification, as manifested by the prevalence of both types of 
errors on the ROC curve, depends on several key parameters. Some are strategic 
parameters of the system, such as the training of the classifiers and the overhead costs 
involved with the system, and some are tactical parameters that can be adjusted within 
the operation of a given classification system, such as the time spent on the classification 
of each item and the decision mechanism that is used to determine the nature of the item 
according to the classification rule. 

An important issue that arises concerns the optimal tuning of the tactical 
parameters of the classification process so that both the accuracy of the classification and 
the timeliness of the resulting intelligence product fall within certain desired ranges. We
consider models in which increased classification skill is associated with an increase in the
mean classification service time. The model and analysis presented in this thesis are a 
first modeling step towards balancing investments in the intelligence cycle, a key 
operations research related issue that was emphasized by the Defense Science Board's 
Advisory Group on Defense Intelligence, in a report on “Operations Research 
Applications for Intelligence, Surveillance and Reconnaissance (ISR)” in 2009. 

In this thesis we make the following contributions: first, we provide decision 
makers in the intelligence community with measures of effectiveness (MOEs) to assess 
the classification process in a given intelligence operations scenario. An item is 
considered a true positive if a well-trained analyst, having enough time and resources, 
would classify it as positive. Two MOEs are suggested: the first measures the 
performance of the classification in terms of the achieved true positive rate, and the 
second measures its cost-effectiveness, where cost is assumed to be driven by the time 
spent on each item and the effectiveness is measured by the number of positive items 
produced. False positives are not directly penalized; however, the analysis time required
to process them reduces the efficiency of the system. An analytic answer for the 
aforementioned tradeoff between classification accuracy and timeliness is obtained via an 
optimization model based on a tandem queuing model for classifying intelligence items 
in the presence of limited resources and time constraints; the model assumes that the 
contribution of an item does not change while it is waiting to be processed. The 
optimization model adjusts the values of the classification process tactical parameters in 
order to maximize the first performance MOE, namely the achieved true positive rate. 

The effectiveness of the classification, as manifested by the true positive rate 
achieved at optimality, is measured with respect to the true positive rate of the system 
when no classification is implemented at all. This measure allows us not only to quantify 
the benefit of the classification under a given scenario, but also to compare the added 
value among different scenarios, yielding rules of thumb for better classification resource
allocation. 

The main parameters and relationships that affect the performance of the 
classification process are identified and used in developing the model. Based on the
optimal results, we obtain measures of effectiveness for decision makers to compare the
performance of the classification in different scenarios. In addition to an analytic study of 
the model, which results in qualitative insight, we also discuss numerical results to obtain 
quantitative insights. 

Using the implemented model, different intelligence operations scenarios are 
studied and compared. We consider two scenarios of timeliness: a tactical scenario such 
as tactical engagements on the battlefield in which intelligence information is needed 
quickly (e.g., a “ticking time bomb” scenario), and a strategic scenario such as long-term
armament transactions. Three scenarios are considered with respect to the source quality, 
which is defined as the fraction of true positive items among all items in the source: low, 
medium and high value sources. 

We have shown that for the implemented model, the larger the fraction of items in 
the source that are true positives (higher quality source) the less beneficial it is, in both 
measures of effectiveness, to implement the classification. In addition, for low quality 
sources (i.e., fewer true positives) classification improves the true positive rate for 
tactical scenarios more than for strategic ones. For high quality sources, the opposite 
applies. In addition, a cost-effectiveness study of the relationship between the classifier 
cost and the classification capability limit is developed; this relationship can be used to 
compare different classifiers. Specifically it is shown that for a high-quality source, it is 
more cost-effective to allow the analysts to directly use items from the source without 
pre-classification, despite the limited resources. Given the cost of the classification, the 
break-even source quality at which both alternatives, namely with and without
pre-classification, bear the same cost can be estimated.
I. INTRODUCTION


A. INTRODUCTION 

Intelligence provides leaders with information and knowledge to support decision 
making. The process of producing intelligence is commonly described as a cycle of 
phases that begins with the issuing of an information request and ends with the 
dissemination of a coherent assessment to the relevant consumers. The new information 
leads to an update of the information requests and hence the cyclic nature of the process. 

Today’s vast usage of communication results in a glut of information, which may 
create bottlenecks in phases that require close attention to each information item such as 
translation, processing and analysis. Moreover, as the timeliness of the information 
becomes more crucial, the bottleneck’s effect becomes more critical. 

To ameliorate the performance of those time-consuming phases, a preliminary 
classification of items is performed to distinguish positive items from negative ones. This 
involves an item being classified as positive if it is relevant and provides additional 
information to an intelligence request and negative otherwise. A true positive is an item 
that would be classified by the analyst as positive. Nevertheless, this binary classification 
process requires additional resources such as personnel and time, and it is subject to false 
positive and false negative errors. The tradeoff between both types of errors is 
represented by a functional relationship called the Receiver Operating Characteristic 
(ROC) curve. 

The quality of the classification, as manifested by the prevalence of both types of 
error on the ROC curve, depends on several key parameters. Some are strategic 
parameters of the system, such as the training of the classifiers and the overhead costs 
involved with the system, and some are tactical parameters that can be adjusted within 
the operation of a given classification system, such as the time spent on the classification 
of each item and the decision mechanism that is used to determine the nature of the item 
according to the classification rule. 
B. MOTIVATION AND RESEARCH FOCUS


The motivation of this research is two-fold. First, this research provides decision 
makers in the intelligence community with measures of effectiveness (MOEs) to assess 
the classification process in a given intelligence operations scenario, where an item that is
desired in terms of relevance is called a positive, and a negative otherwise. Two MOEs are
suggested: the first measures the performance of the classification in terms of the 
achieved false negative rate, and the second measures its cost-effectiveness, where cost is 
assumed to be driven by the time spent on each item and the effectiveness is measured by 
the number of correctly identified positive items. False positives are not directly
penalized; however, the analysis time required in order to process them reduces the 
efficiency of the system. 

The second motivation is to optimize the tactical parameters of the classification 
process with respect to the first performance MOE, and study the effect of input 
parameters, such as inflow rate, as well as the strategic parameters on both MOEs via 
sensitivity analysis. 

In this thesis, we develop a queuing model, embedded in an optimization model, 
which provides an analytical solution for the question of how to best tune the 
classification process such that both accuracy of the classification, as well as its 
timeliness, are adequately addressed. The model is implemented using the General 
Algebraic Modeling System (GAMS) and solved using the CONOPT3 non-linear 
programming solver (Brooke et al., 1998; Drud, 2005).

C. CONTRIBUTIONS OF THIS WORK 

The main contribution of this thesis is a novel optimization model for classifying 
intelligence items in the presence of limited resources and time constraints. The main 
parameters and relationships that affect the performance of the classification process are 
identified and used in developing the model. Based on the optimal results, we obtain 
measures of effectiveness for decision makers to compare the performance of the 
classification in different scenarios. In addition to an analytic study of the model, which
results in qualitative insight, we also implement the model numerically and obtain 
quantitative insights. 

D. STRUCTURE OF THE THESIS AND CHAPTER OUTLINE

This thesis has five chapters. Following Chapter I (Introduction), Chapter II 
provides background information on the intelligence process and, specifically, on the 
problem addressed in this thesis, which is optimizing a binary classification process. In 
Chapter III, the operational setting is presented and the model is formulated and 
discussed. In Chapter IV, we present numerical results from the model and provide a 
brief sensitivity analysis of the input variables. Chapter V summarizes the research, 
presents the main findings and insights, and discusses potential future work in the area. 
II. BACKGROUND


A. INTELLIGENCE AS A PRODUCT 

Intelligence is defined by the U.S. Department of Defense (DoD) as "the product 
resulting from the collection, processing, integration, evaluation, analysis, and 
interpretation of available information concerning foreign nations, hostile or potentially 
hostile forces or elements, or areas of actual or potential operations."^1 Besides the use of
intelligence as an indispensable tool to support a national leader’s decision-making
process, competitive intelligence (CI) has emerged in recent decades as an environment
meant to provide a competitive edge for a privately owned organization (Khaner, 1998). 

Thus, intelligence is a product of a process termed the intelligence process, 
defined by the DoD as "the process by which information is converted into intelligence 
and made available to users." The root of the intelligence process lies in the need for 
information by decision makers at every level. 

The intelligence process is most commonly described as a feedback process called 
the intelligence cycle, which is a continuous investigation that allows decision makers to 
collect relevant information for supporting informed decisions. The intelligence cycle 
consists of five key elements^2,3 (see Figure 1): (1) Planning and Direction, (2) Collection,
(3) Processing and Exploitation, (4) Analysis and Production, and (5) Dissemination.


1 DoD Dictionary of Military Terms, "Intelligence,"
http://www.dtic.mil/doctrine/dod_dictionary/data/i/4850.html Intelligence.

2 Central Intelligence Agency, "Factbook on Intelligence: The Intelligence Cycle," 
http://www.fas.org/irp/cia/product/fact97/intcycle.htm. 

3 FBI, Intelligence Cycle, http://www.fbi.gov/about-us/intelligence/intelligence-cycle. 
Figure 1. The intelligence cycle


The intelligence cycle is merely a model describing the outline of the intelligence 
production process and, like any other model, is a simplification and abstraction of a 
much more complex process. Despite several arguments that have been raised against the 
oversimplification of the intelligence process as the intelligence cycle (Richelson, 1999; 
Hulnick, 2006), it is useful for exploring tradeoffs and interactions among its components.

Miller et al. (2004) constructed an aggregated simulation model according to the 
aforementioned intelligence cycle that allows comparisons between different structures of 
the intelligence process. The comparison is based on four intuitive measures of 
performance: quality, quantity, timeliness, and satisfaction of information needs. Bose 
(2008) lists six measures of effectiveness for the intelligence product in competitive 
intelligence environment: accuracy, objectivity, usability, relevance, readiness, and 
timeliness. Nevertheless, each phase of the intelligence cycle may require adapted 
measures that refer to its specific role in the cycle. 

1. Planning and Direction 

The intelligence cycle is initiated with the identification of information needs, 
which are called intelligence requirements (IR). An intelligence agency is usually 
required to produce intelligence that serves decision making with respect to a list of IRs
issued by possibly different users. The planning and direction phase identifies these IRs
and prioritizes them so that subsequent phases can adjust their operation accordingly. The
planning and direction phase also responds to the outcomes at the end of the intelligence
cycle, since the delivered intelligence generates new requirements and may result in
re-prioritization of the existing IRs.

2. Collection 

The collection of intelligence incorporates a variety of means to gather raw 
information from which subsequent phases produce finished products. Many different 
intelligence collection disciplines exist (Johnson & Wirtz, 2004), most notably: (1) 
Human Intelligence (HUMINT) collects information from persons such as agents and 
defectors, (2) Signals Intelligence (SIGINT) collects information by intercepting signals
between people such as communication lines (known as communications intelligence,
COMINT) and electronic emissions not intended for communications (known as
electronic intelligence, ELINT), (3) Imagery Intelligence (IMINT) collects information
from imaging systems such as satellite images and aerial photography, and (4) Open
Source Intelligence (OSINT) harvests open sources such as TV, radio
and newspapers for information. The latter source of information has been given new life with
the spread of the internet and the proliferation of information in other broadcasting media 
(Hulnick, 2006). 

In order to collect the raw information from available sources, an intelligence 
collection plan is generated. The collection plan should provide enough raw information 
for subsequent intelligence cycle elements so that the IRs issued by the planners are 
adequately satisfied. The collection plan follows the IRs' priorities because usually the 
collection resources are scarce compared to the spectrum of requirements and potential 
collection efforts and sources. 

3. Processing and Exploitation 

The processing phase is designed to transform the raw information that was
collected into products that may be used in the analysis phase. The nature of the collected
raw material dictates the type of operations to be included in the processing effort, as well
as the required analysis capabilities. Common processing operations include data
reduction, noise reduction, decryption, language translations, context clarification and 
more. Processing also includes the loading of the collected data into easily accessible 
databases where it can be exploited for use by the analysts later on. 

The processing phase is the first contact with the raw information after it has been 
collected, and it is usually performed by those who have the background to understand 
the environmental and operational context in which it was collected. For example, when 
processing a raw IMINT product, technical parameters such as the location from which 
the image was taken, the time of the day, etc., can enhance the information content of the 
raw product in order to give the analyst a richer context. We call the people and 
technologies used during the processing phase the processors.

In many scenarios, processing resources can become a bottleneck, e.g., a limited
number of translators or limited computing resources. When such limitations are
present, a preliminary classification, in which some collected items are filtered out, must 
be done in order to reduce the flow of information and thus affect the processing 
throughput. Even tasks that simply require formatting and context enrichment of an item 
will create a bottleneck when the amount of collected information increases. This is 
particularly the case when it comes to SIGINT, which deals with a glut of information 
requiring relatively significant processing effort that includes translation and context 
enrichment. 

4. Analysis and Production 

Johnston (2005) defines intelligence analysis as “the application of individual and 
collective cognitive methods to weigh data and test hypotheses within a secret
socio-cultural context.”^4

The job of the analyst, as described by Hulnick (2006), is to evaluate the 
relevancy of the processed intelligence and put it in perspective with respect to current 


4 Analytic Culture in the U.S. Intelligence Community, Chapter One, 
https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and- 
monographs/analytic-culture-in-the-u-s-intelligence-community/chapter_l.htm. 
assessments. The FBI^5 includes under the analysis phase the integration, evaluation and
analysis of the available data, and the subsequent preparation of the final intelligence 
product. 


a. The Foraging and Sense Making Loops 

Card and Pirolli (2005) provided an empirical descriptive study of the 
intelligence analysis using a cognitive task analysis. In their study, the analysis process is 
separated into two loops: (1) the foraging loop in which valuable information is culled 
and becomes evidence (Pirolli & Card, 1999), and (2) the sense-making loop (Russell et
al., 1993) in which an integrated story that explains and presents the gathered evidence is
iteratively developed. The process is described in Figure 2. 



Figure 2. Nominal sense-making loop for some types of intelligence analysts^6


5 FBI, Directorate of Intelligence, Intelligence Cycle, http://www.fbi.gov/about- 
us/intelligence/intelligence-cycle. 

6 National Visualization and Analytics Center, Illuminating the Path: The Research and Development 
Agenda for Visual Analytics, p 44. http://nvac.pnl.gov/docs/RD_Agenda_NVAC_chapter2.pdf 
An important difference between the two loops in our context is that the
foraging loop, which follows the processing phase, deals with each item independently 
until it becomes evidence. On the other hand, the sense-making loop deals with the 
integration of a collection of evidence files and their presentation, rather than with the 
independent evidence files. Therefore, measuring and referring to intelligence material in 
terms of items is relevant up to the sense-making loop. 

The foraging loop begins by searching and filtering the valuable items 
from external, usually vast, repositories. The relevant items are gathered in a virtual 
holder called the shoebox. The shoebox items are read and used to create evidence files, 
which are the building blocks of the sense-making loop. 

b. Balancing Foraging and Sense Making 

The analyst is required to balance his time between gathering evidence 
and integrating it into a coherent story. During this process, the analyst usually has to 
deal with uncertainties, missing information and high complexity. It is also common to 
have to deal with contradictory evidence due to counter-intelligence, unreliable resources, 
misconceptions, etc. 

The intelligence community has focused during the past decades on 
improving the intelligence collection capabilities in order to keep up with the information 
revolution and spread of communication technologies. As a consequence, intelligence 
analysis is far exceeded now by the collection capabilities, and is required to focus on 
detecting relatively few valuable information items and integrate them to support 
reasoning (Heuer, 2001). 

If no classification of information as valuable or non-valuable is made prior
to the analysis, the amount of available information may be enormous and significantly 
larger than the capability of the analyst to digest. Hence, there is an acute need for a 
method to cull the valuable information to be analyzed and discard the rest. The 
challenge of reconciling the large collection capacity with the limited analysis capability 
has been addressed in recent studies, such as graph-based algorithms (Coffman et al.,
2004), risk-based methodologies for scenario tracking (Horowitz & Haimes, 2003) and
Bayesian fusion of information (Pate-Cornell, 2002). These methodologies offer
powerful tools to the analyst in performing the sense-making loop when coping with 
current analysis challenges. Greitzer (2005) tackles the challenge of assessing the impact 
of a tool or a methodology in the intelligence analysis context. 

Information retrieval efforts (Singhal, 2001), which are widely applicable 
in civilian applications, constantly improve the ability to identify and retrieve the 
valuable information out of a large database. Nevertheless, if not bounded, the foraging 
efforts may take much of the analyst’s time, resulting in less time available for
sense-making, and a degraded overall performance.

In order to overcome this risk, the analyst may take advantage of the a priori
classification that the intelligence cycle offers at the processing phase. Early
classification is implicitly employed when unprocessed or unusable information items are 
discarded at the processing phase, e.g., information that could not be translated, 
decrypted, etc. Nevertheless, early classification can also be performed in order to meet 
the analyst’s requests regarding the desired items. Items that meet those requests are the 
ones to be processed and submitted to the shoebox before other items, thus their relative 
portion increases. 

5. Dissemination 

The final phase requires the distribution of the intelligence products to the 
consumers who initiated the corresponding IR. The disseminated intelligence may have 
different formats, such as reports, bulletins, assessments, studies etc., and may be 
distributed in different temporal manners: periodically, upon retrieval, upon request etc. 
Once the products are disseminated, the cycle goes back to planners and directors to
re-prioritize IRs and issue new ones.

B. PROCESSING PHASE CLASSIFICATION 

A key difference between the processing and analysis phases lies in the nature of 
the classification. While classification by an analyst is tentative since the item may be 
revisited during one of the following foraging iterations, classification by the processor is 
irreversible since items that are filtered out are discarded. Since analysis and processing
are usually done by separate organizations (Defense Science Board, 2009), that 
difference only deepens due to immobility of resources between the phases and agenda 
conflicts. Figure 3 displays how items discarded in the processing phase, either before 
actual processing (top) or after (bottom), do not reach the repository that serves the 
analysis phase later on. The top figure displays a bottleneck at the processing phase and 
the bottom displays a bottleneck at the analysis phase. 
Figure 3. Pre-processing classification (top) due to a bottleneck at the processing
phase, e.g., translation, and post-processing classification (bottom) due to a
bottleneck at the analysis phase


If the analyst is only interested in a small portion of the collected information, 
processing all of it may result in tilting the balance between foraging and sense-making
towards the former, resulting in a degraded overall performance. In particular, this is the
case when each processed item is read by the analyst, or when the processed items
repository has limited capacity, especially in SIGINT. In this case, the analyst may 
delegate the search and filter phase to the processing phase. Hence, even when the 
processing phase does not have an inherent bottleneck, e.g., when it is automated, a 
requirement for processor classification may arise in order to meet limitations at the 
analysis phase.

C. BINARY CLASSIFICATION PROCESS AND MEASURES 

As pointed out in previous sections, the collected information goes through a 
binary classification process; an item is either classified as valuable to a certain IR and 
true in the sense that it conveys ground truth, or otherwise. Items that are declared as
non-valuable or untrue are discarded, while the rest are processed and make their way to the
foraging loop of the analysis phase. Since intelligence items are handled individually, the 
classification process becomes, in essence, a typical binary classification that assigns
each item to one of two groups: valuable and non-valuable. The classification is done based on
a test that comprises a set of evaluation questions and a decision mechanism that draws 
the line between a valuable item and a non-valuable one given the results of the test. For 
example, the decision on whether an IMINT product is valuable or not for a certain IR is 
naturally based on a test, which among other questions asks: has the image changed since
the last processed image of the same area of interest; does it answer the IR; is it usable 
enough for analysis, etc. 

1. Sensitivity and Specificity

In order to characterize the performance of a binary classifier, two statistical 
measures are widely used: Sensitivity and Specificity. If we refer to a valuable item as a 
“positive” and to a non-valuable item as a “negative,” the sensitivity p is the probability 
of correctly classifying a positive, while the specificity q is the probability of correctly 
classifying a negative: 

$$p = \Pr(\text{positive id} \mid \text{positive item}) \quad \text{and} \quad q = \Pr(\text{negative id} \mid \text{negative item})$$
The sensitivity and specificity of a classifier can be estimated by measuring
empirically the performance of the classification on a set of items. Let TP be the number
of true positives (i.e., positives correctly classified as positives), TN be the number of true
negatives, FP be the number of false positives (i.e., negatives incorrectly classified as 
positives) and FN be the number of false negatives. In the experiment, the TP, TN, FP 
and FN are counted with respect to the experimenter’s knowledge, and in our context, the 
analyst’s. 

Given these counts, the estimates for the sensitivity, p, and the specificity, q, are 
given by: 

$$\text{sensitivity} = p = \frac{TP}{TP + FN} \quad \text{and} \quad \text{specificity} = q = \frac{TN}{FP + TN} \qquad (2.1)$$


The false positive rate (FPR) is the probability of incorrectly classifying a 
negative item, and the false negative rate (FNR) is the probability of incorrectly 
classifying a positive item. The specificity and sensitivity are the complement 
probabilities of the FPR and the FNR, respectively. That is: 


$$FPR = 1 - q = \frac{FP}{FP + TN} \quad \text{and} \quad FNR = 1 - p = \frac{FN}{TP + FN} \qquad (2.2)$$
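The estimates in (2.1) and (2.2) can be illustrated with a minimal Python sketch. The class name `ConfusionCounts` and the counts used below are hypothetical and chosen only for illustration; they do not come from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a labeled evaluation set (labels supplied by the analyst)."""
    tp: int  # positives correctly classified as positives
    tn: int  # negatives correctly classified as negatives
    fp: int  # negatives incorrectly classified as positives
    fn: int  # positives incorrectly classified as negatives

    @property
    def sensitivity(self) -> float:
        # p = TP / (TP + FN), as in (2.1)
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # q = TN / (FP + TN), as in (2.1)
        return self.tn / (self.fp + self.tn)

    @property
    def fpr(self) -> float:
        # FPR = FP / (FP + TN) = 1 - q, as in (2.2)
        return self.fp / (self.fp + self.tn)

    @property
    def fnr(self) -> float:
        # FNR = FN / (TP + FN) = 1 - p, as in (2.2)
        return self.fn / (self.tp + self.fn)


# Hypothetical experiment: 50 positive items and 950 negative items.
counts = ConfusionCounts(tp=40, tn=850, fp=100, fn=10)
print(f"sensitivity p = {counts.sensitivity:.3f}")
print(f"specificity q = {counts.specificity:.3f}")
print(f"FPR = {counts.fpr:.3f}, FNR = {counts.fnr:.3f}")
```

In this made-up example the classifier recovers 40 of the 50 positives, and the complement relations FPR = 1 - q and FNR = 1 - p hold by construction.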


A perfect classification is characterized by perfect sensitivity and specificity, 
i.e., p = q = 1. On the other hand, an arbitrary classification, with a certain probability r of
declaring an item as positive, is characterized by any sensitivity and specificity
probabilities that satisfy p = 1 - q = r. For example, p = q = 0.5 characterizes a
classification system that uses a fair coin to decide whether an item is positive or
negative, while p = 1, q = 0 implies no classification; all items are declared as positives.
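The relation for an arbitrary classification can be derived in one step. The sketch below assumes, as the term "arbitrary" suggests, that the declaration is made independently of the true nature of the item; this independence assumption is ours and is not stated explicitly in the text.

```latex
\begin{align*}
p &= \Pr(\text{positive id} \mid \text{positive item}) = \Pr(\text{positive id}) = r, \\
q &= \Pr(\text{negative id} \mid \text{negative item}) = \Pr(\text{negative id}) = 1 - r, \\
\text{hence} \quad p &= 1 - q = r .
\end{align*}
```

Setting r = 0.5 gives the fair-coin case, and r = 1 gives the case in which every item is declared positive.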

Given a single evaluation question answered on an item, the classification system 
decides whether to declare the item as positive or negative using a decision mechanism. 
By adjusting this decision mechanism, e.g., changing a threshold value, the 
classification system can increase the sensitivity while decreasing the specificity and vice 
versa. The decision mechanism represents a policy, which is executed in the classification 
process. When a longer set of questions is answered on each item, more information is 
gathered; hence, a more informed decision can be made and the accuracy may improve. 
Therefore, there are essentially two degrees of freedom when planning a classification 
process: the test, namely the set of questions to be answered for each item, and the 
decision mechanism that depends on the possible outcomes of the test. 

The accuracy, ACC, measures the overall rate of correct classification, estimated 
empirically as $ACC = \frac{TP + TN}{P + N}$, where $P + N$ is the total number of classified items. 
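The estimates in (2.1), (2.2) and the accuracy can be computed directly from the four counts. The following is a minimal sketch; the class name `Counts`, the function name `rates` and the example counts are illustrative assumptions, not part of the original study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Outcome counts of a classification experiment (hypothetical container)."""
    tp: int  # positives correctly classified as positives
    tn: int  # negatives correctly classified as negatives
    fp: int  # negatives incorrectly classified as positives
    fn: int  # positives incorrectly classified as negatives


def rates(c: Counts) -> dict:
    """Return the empirical measures of (2.1), (2.2) and the accuracy ACC."""
    positives = c.tp + c.fn  # P, total number of positive items
    negatives = c.fp + c.tn  # N, total number of negative items
    return {
        "sensitivity": c.tp / positives,           # p
        "specificity": c.tn / negatives,           # q
        "FPR": c.fp / negatives,                   # 1 - q
        "FNR": c.fn / positives,                   # 1 - p
        "ACC": (c.tp + c.tn) / (positives + negatives),
    }


# Made-up counts: 100 positives and 900 negatives.
example = rates(Counts(tp=40, tn=850, fp=50, fn=60))
print(example)
```

With these made-up counts the accuracy is high even though more than half of the positives are missed, which illustrates why ACC alone can mislead when positives are rare.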

While ACC used to be a common measure of effectiveness for summarizing the 
performance of a binary classification system, it has also been criticized as a poor and 
misleading metric by Provost et al. (1998), mainly because it assumes equal costs for both 
misclassification types (FP and FN) and that the distribution of the positives and the 
negatives in the target environment is known. Instead of the single-number metric, they 
propose the use of the Receiver Operating Characteristic curve. 

2. Receiver Operating Characteristic (ROC) Curves 

A widely used tool to explore and present the tradeoff between the sensitivity and 
specificity of a binary classification system is a graphical plot called receiver operating 
characteristic or, in short, ROC curve (Swets, 1988; Fawcett, 2006). 

ROC curves characterize classification systems and allow comparison of systems 
as well as cost benefit analysis of the decision mechanism. These curves are common in 
machine learning, data mining, as well as in medicine and radar operation. 

The x-axis of the ROC curve is the false-positive rate, $FPR = 1 - q$, and the y-axis 
is the sensitivity p. The ROC space is the (theoretically) feasible region of the ROC 
curve, namely the square region defined by the points (0,0), (0,1), (1,1), (1,0). The point 
(0,1) represents a perfect classification system with no errors. A point on the 
diagonal line $p = 1 - q$ represents a random guess with certain probability $r = p$ of 
declaring an item as positive. 

A piecewise linear approximation of the ROC curve may be generated by a series 
of experiments with the classification system. For certain decision mechanisms, the 
system classifies a set of items and the probabilities are computed out of the classification 
results to create a single point on the ROC curve. 

Because one can always reverse the cue, and assuming that the classifier is not 
biased due to deception or other measures, the random guess is the worst case scenario 
and therefore no $(1 - q, p)$ values are possible under the diagonal $p = 1 - q$. 

The area under the ROC curve (AUC) is a scalar measure of performance for the 
classification system. It allows a high-level comparison among different classification 
systems on the entire ROC space. The AUC can vary from 0.50 (worst-case scenario) to 
1.0 (perfect classification) when assuming that no classification system can go below the 
diagonal (Marzban, 2004). 
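For a piecewise linear ROC approximation built from a few experimentally measured points, the AUC can be estimated with the trapezoidal rule. This is a minimal sketch; the function name `auc_trapezoid` and the sample points are illustrative assumptions.

```python
def auc_trapezoid(points):
    """Area under a piecewise linear ROC curve.

    points: iterable of (false_positive_rate, sensitivity) pairs, which
    should include the trivial end points (0, 0) and (1, 1).
    """
    pts = sorted(points)  # order by false-positive rate
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid between neighbours
    return area


# The random-guess diagonal and a made-up measured curve.
diagonal = [(0.0, 0.0), (1.0, 1.0)]
measured = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.85), (1.0, 1.0)]

print(auc_trapezoid(diagonal))
print(auc_trapezoid(measured))
```

The diagonal yields the worst-case value of 0.5 mentioned above, and any curve lying above it yields a larger area.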

The adjustment of the ROC curve of an intrusion detection system given a limited 
investigation capacity is explored in the field of computer networks security by Yue and 
Cakanyildirim (2010). They use a decision tree approach in which the cost of each 
investigation is weighted against the damage by such an intrusion. This type of cost- 
based analysis is hard to apply under the context of intelligence since the benefit of a 
single item is measurable only with respect to the final product of the cycle, rather than 
its value as an independent item. Therefore, the notion of damage per unidentified item is 
artificial. 

3. Precision and Recall 

A closely related area of research is Information Retrieval, which is the science of 
searching for documents corresponding to a certain query in a set of general documents, 
usually held in a database or another repository. Information Retrieval uses similar 
performance measures for the system, namely precision and recall. Precision is the 
fraction of valuable documents that are retrieved among the total number of documents 
the same search retrieved. Recall is the fraction of valuable documents retrieved out of 
the total number of valuable documents in the search set. For an empirical trial, the 
precision and recall of a classification system are estimated using the same parameters as 
before, and their relationship to the sensitivity, p, and the specificity, q, is given by: 

$$ \text{precision} = \frac{TP}{TP + FP} = \frac{p \cdot P}{p \cdot P + (1 - q) \cdot N} \quad \text{and} \quad \text{recall} = \frac{TP}{TP + FN} \qquad (2.3) $$

where TP, FP are the number of true positives and false positives, respectively, FN is the 
number of false negatives, and P,N are the total number of positives and negatives in the 
trial set. 
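Expression (2.3) shows that precision depends on the numbers of positives and negatives, P and N, while sensitivity and specificity do not. The short sketch below illustrates this with made-up numbers; the function name `precision_from_rates` is an assumption introduced here.

```python
def precision_from_rates(p, q, num_pos, num_neg):
    """Precision implied by sensitivity p and specificity q, following (2.3)."""
    true_pos = p * num_pos            # expected TP
    false_pos = (1.0 - q) * num_neg   # expected FP
    return true_pos / (true_pos + false_pos)


# Same classifier (p = q = 0.9) applied to two different item mixes.
balanced = precision_from_rates(0.9, 0.9, num_pos=100, num_neg=100)
rare_positives = precision_from_rates(0.9, 0.9, num_pos=10, num_neg=990)

print(balanced, rare_positives)
```

The same point in ROC space therefore corresponds to very different precision values when positives are rare, which is the situation typical of the intelligence setting discussed here.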

While both measures were originally developed to characterize set retrieval, 
current research deals with ranked retrieval models that rank results by estimated 
likelihood of relevance to the given query. Among the measures that take into account the 
ranked order of the results are mean average precision, MAP, and normalized discounted 
cumulative gain, NDCG (Jarvelin & Kekalainen, 2002). 

For information foraging problems that aim at very rare but valuable information, 
recall is still used as the primary performance measure. In the extreme case of 
information foraging, the problem becomes an information availability problem in which 
the information seeker is uncertain regarding the very existence of the information that 
was searched for. In these problems, which often include intelligence analysis situations, 
the importance of recall as a measure of effectiveness becomes critical (Pirolli, 2009). 

Similar to the ROC curves used in the sensitivity-specificity terminology, 
Precision-Recall (PR) curves are used in fields such as machine-learning and data mining 
in order to illustrate graphically the performance of a classification system. One 
advantage of the ROC curve over the PR-curve is its independence of the P and N values 
that are required to compute the precision. Nevertheless, the PR curve is equivalent to the 
ROC curve, in the sense that a domination relationship between curves in the ROC space 
exists if and only if it exists in the PR space (Davis & Goadrich, 2006). 

D. INTELLIGENCE OPERATIONS RESEARCH 

Intelligence Operations Research (OR) refers to the implementation of OR 
techniques to benefit and improve the intelligence process, by modeling and solving 
specific problems of the intelligence community. A broad overview of the subject is 
provided by Kaplan (2010b). 

The “Advisory Group on Defense Intelligence,” a committee of the 
Defense Science Board that examines and advises on matters related to defense intelligence 
in the DoD, examined the use of OR in Intelligence, Surveillance and Reconnaissance, 
ISR (Defense Science Board, 2009) and concluded that “Operations Research is applied 
inconsistently throughout the Defense and ISR communities. These communities do not 
possess standard OR processes and practices, a consistent organizational model, or a 
consistent commitment to the use of OR” (page 37). 

In order to establish OR as a beneficial methodology for intelligence, two test 
cases were suggested: (1) balancing investments in the intelligence cycle, and (2) investment 
decision making in biometrics technologies. The significant challenge that the task force 
pointed out is how to formulate the objective function of the intelligence product and its 
production phases. Another issue was the diversity of organizations in charge of the 
various phases of the intelligence production process, making cross-phase resource 
allocation more difficult to implement. 

Few operations research models were suggested for the intelligence work itself. 
Steele (1989) models the time until a secret is disclosed when it is shared among members 
of a group. 

Skroch (2005) uses an interdiction model to optimize interdiction resources in 
order to hinder a nuclear weapons project. Another proposed model (Pinker et al., 2009) 
is a mixed integer linear programming model in which the proliferator minimizes the 
time from detection to completion rather than simply minimizing time to completion. 

Kaplan (2010a) describes models for infiltration and interdiction of terror plots by 
HUMINT agents using “terror queues,” Markovian queues in which terror plots are 
customers and the HUMINT agents are the servers interdicting the served terror plots. 
The reneging terror plots are those that are executed before being interdicted. 

III. THE MODEL 


A. CHAPTER OVERVIEW 

This chapter presents the model of the intelligence processing-analysis system 
described in Chapter II and discusses its assumptions. The proposed model is a tandem 
queue where each station corresponds to a phase in the system: processing and analysis. 
The basic queuing model is implemented in an optimization model that determines 
operational parameters. 

As discussed in Chapter II, many research efforts have been focused on both the 
processing and analysis phases of the intelligence cycle, proposing qualitative and 
quantitative methods to overcome the challenges associated with the operation of each 
phase. Nevertheless, to the best of our knowledge, none of these studies focused on 
quantifying operational parameters and allocating resources between the two key phases 
in the cycle: processing and analysis. Even if efficient and effective practices are 
implemented in each phase separately, it may well be that as a combined system the 
operation is not optimal. Overcoming this possible sub-optimality is the main motivation 
for the model described in this chapter. The main goal of our model is to determine the 
optimal values of the tactical parameters of the classification and compare the 
improvement of the optimal setting under different scenarios. To keep the model general 
we consider a highly aggregated form of each phase, focusing on easy-to-measure 
characteristics of the phase, such as service rates and quality of performance, rather than 
on its detailed modus operandi. 

B. OPERATIONAL SETTING 

Before formulating the model, we describe the operational setting and pose some 
assumptions. The basic scenario considers a single Intelligence Requirement (IR) and a 
given collection plan, which determine the characteristics of the intelligence items for 
processing. For the sake of brevity, we simply refer to those as items. The items that are 
going through the processing phase consist of two types with respect to the specified IR: 
valuable items, called henceforth positives (P), and worthless items, called henceforth 
negatives (N). Positives are items that are both relevant to the IR and convey additional 
ground truth according to the analyst, and negatives are items that are either irrelevant to 
the IR, repetitive, or contain incorrect information, again, according to the analyst. 

As discussed in Chapter II, the process we model is described as follows: an 
incoming item is first classified as positive or negative based on the information gathered 
on that item by the processor. 

If the item is classified as a positive then it is passed on to the analysis phase; 
otherwise, the item is discarded and cannot be revisited later on. The analyst then inspects 
each submitted item and establishes its significance and implication on current 
knowledge base, e.g., by updating the IR or disseminating the newly retrieved 
information to relevant decision makers. 

The classification at the first phase is subject to false-negative and false-positive 
errors (FN and FP). While false-positive items increase the number of items that are 
passed on to analysis, and thus increase the load for the analysts, they will eventually be 
detected as such by the analysts and therefore will cause no additional harm. On the other 
hand, the false-negative items comprise valuable information that is lost, at least until it 
may reappear in another positive item. Figure 4 describes the processing-analysis system. 

Figure 4. The processing-analysis system 

C. THE TANDEM QUEUE MODEL 

Recall that we assume that information arrives in a discrete form of items. The 
stochastic counting process of items arriving for processing is the sum of two 
independent homogeneous Poisson processes: (1) positives $\{P(t); t \ge 0\}$ with an arrival 
rate $\lambda_P$ and (2) negatives $\{N(t); t \ge 0\}$ with an arrival rate $\lambda_N$. Given an item i, we denote 
$i \in P$ if it belongs to $P(t)$, and $i \in N$ otherwise. 

Let $j = 1, 2$ be the index of the processing station and the analysis station, 
respectively, and let $\lambda_j, \mu_j \in (0, \infty)$, $j = 1, 2$, be the arrival and service rates at station j. 

Assuming that the inter-arrival and service times are exponentially distributed and 
independent of everything, the resulting model is a tandem M/M/1 queue. Given the 
arrival process rates $\lambda_P, \lambda_N$, the sensitivity p and specificity q of the classification process, 
the arrival rate of items for analysis is: 

$$ \lambda_2 = p \lambda_P + (1 - q) \lambda_N \qquad (3.1) $$

The tandem queue is displayed in Figure 5. Note that while the arrival processes 
are displayed separately in the figure for clarity purposes, the processing station cannot 
separate between positives and negatives before service completion, and sees both as a 
single arrival process. In addition, the independence assumption implies that previous 
classifications have no effect on the true classification of incoming items, in the sense 
that there is no cumulative learning from the intelligence products. 



Figure 5. The tandem M/M/1 model for the described process 

The tandem queue is stable if $\lambda_j < \mu_j$ for both stations $j = 1, 2$, in which case $W_j$, 
the long-run expected delay of an item in station j, is finite. The long-run expected delay 
of an item in the processing-analysis system, W, is given by: 

$$ W = W_1 + W_2 = \frac{1}{\mu_1 - \lambda_1} + \frac{1}{\mu_2 - \lambda_2} \qquad (3.2) $$

where $\lambda_1 = \lambda_P + \lambda_N$ is the total arrival rate at the processing station and $\lambda_2$ is given by (3.1). 
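The expected delay in (3.2) can be evaluated numerically once the rates and the classification quality are fixed. The following is a minimal sketch, assuming the standard M/M/1 expected delay $1/(\mu - \lambda)$ at each station; the function name `tandem_delay` and the numerical values are illustrative assumptions.

```python
import math


def tandem_delay(lam_p, lam_n, mu1, mu2, p, q):
    """Long-run expected delay W of the tandem M/M/1 system.

    lam_p, lam_n: arrival rates of positives and negatives
    mu1, mu2:     service rates of the processing and analysis stations
    p, q:         sensitivity and specificity of the classification
    Returns math.inf when either station is unstable.
    """
    lam1 = lam_p + lam_n                  # all items reach station 1
    lam2 = p * lam_p + (1.0 - q) * lam_n  # items declared positive, (3.1)
    if mu1 <= lam1 or mu2 <= lam2:
        return math.inf                   # stability condition violated
    return 1.0 / (mu1 - lam1) + 1.0 / (mu2 - lam2)


# Made-up scenario: 2 positives and 8 negatives arrive per unit time.
w_selective = tandem_delay(2.0, 8.0, mu1=12.0, mu2=5.0, p=0.9, q=0.8)
w_pass_all = tandem_delay(2.0, 8.0, mu1=12.0, mu2=5.0, p=1.0, q=0.0)

print(w_selective, w_pass_all)
```

In this made-up scenario, passing every item on to analysis (p = 1, q = 0) overloads the analysis station, whereas a selective classification keeps the delay finite.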

While the average service rate at the analysis station, $\mu_2$, only affects $W_2$, the 
average service rate of the processing phase, $\mu_1$, affects both $W_1$ and the classification 
quality, manifested in p and q, which affects $\lambda_2$ and therefore also $W_2$. Introducing a 
relationship between the classification quality and the rate at which it is performed allows 
us to capture tradeoffs between quality and quantity in the classification process at station 
1. Decreasing $\mu_1$, that is slowing down the classification, increases $W_1$ and allows 
performance of a longer test; we assume the quality of the classification increases as well. 
When $\mu_1$ increases, the opposite holds. In reality, increasing $\mu_1$ may be manifested, for 
example, by reducing the number of questions asked on each item in the processing 
phase, or answering these questions faster. We assume that a longer mean service time 
$\frac{1}{\mu_1}$ is associated with a longer sequence of questions that the classifier is able to answer 
on each item; this will result in a higher classification accuracy. In the next section, we 
discuss the modeling of the relationship between $\mu_1$ and the quality of the classification 
as manifested in its sensitivity and specificity. 

D. THE CLASSIFICATION PROCESS 

As mentioned above, given an item, the classification process can be abstractly 
modeled as a sequence of questions used to gather information about that item. Once 
those questions are answered, a decision mechanism decides, based on the collected 
information embodied in the answers, whether the item should be declared as positive or 
negative. We define a test as a set of questions asked about each classified item. For 
example, for a test with a single question, which results in a scalar describing the item, 
the decision mechanism may take the form of a threshold value that classifies an item as 
positive if the scalar value is higher than the threshold and negative otherwise. 

1. Modeling the Classification 

To characterize the classification process implemented in the processing phase, 
we define the binary classification setting as a vector of two elements: (1) the 
implemented test and (2) the decision mechanism used to declare an item as positive 
based on the test results. The method by which each one of the two elements is controlled 
in an operational environment is discussed in the subsequent sections. 

Each possible test can be used to plot empirically the ROC curve that describes 
the relations between the sensitivity $p = \Pr\{TP \mid P\}$ of the test and its specificity 
$q = \Pr\{TN \mid N\}$ as one changes the decision mechanism, as discussed in Chapter II. 
Recall from Chapter II that the ROC curve is the sensitivity as a function of the false 
positive rate: $p = f(1 - q)$. 

For each test, the resulting ROC curve of the classification starts at (0,0), which 
represents the trivial threshold where all items are declared as negatives, and ends at 
(1,1), which represents the trivial threshold where all items are declared as positives. We 
assume that the ROC curve is given in a general parametric form: 

$$ p = f(1 - q, \varepsilon) \quad \text{for some parameter } \varepsilon \in [\varepsilon_{\min}, \varepsilon_{\max}] \qquad (3.3) $$

Substituting (3.3) in the expression for $\lambda_2$ given by (3.1) yields: 

$$ \lambda_2 = f(1 - q, \varepsilon) \lambda_P + (1 - q) \lambda_N \qquad (3.4) $$

The parameter $\varepsilon$ is a measure of the quality of the classification. Without loss 
of generality, when $\varepsilon$ assumes the value of its upper bound, $\varepsilon_{\max}$, it implies the worst-
case scenario of random classification. This scenario occurs when, regardless of its 
characteristics, an item is classified as P with a certain probability r and as N with 
probability $1 - r$. In that case the corresponding ROC curve is the diagonal $p = 1 - q = r$. 

On the other hand, when $\varepsilon$ reaches its lowest possible value, i.e., $\varepsilon = \varepsilon_{\min}$, the best 
possible classification is achieved, that is $p = f(1 - q, \varepsilon_{\min}) = 1$ for all q, and in particular 
for $q = 1$, perfect classification is achieved. We assume that when the test is expanded, 
i.e., more questions are answered on each item, $\varepsilon$ decreases. Figure 6 illustrates the 
above formulation in the ROC space for $f(1 - q, \varepsilon) = (1 - q)^{\varepsilon}$ where $\varepsilon \in [0, 1]$. 
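The power-form family $f(1 - q, \varepsilon) = (1 - q)^{\varepsilon}$ used for the illustration can be tabulated in a few lines. This is a minimal sketch; the function names `roc_power` and `auc_power` are assumptions introduced here, and the closed-form area $1/(1 + \varepsilon)$ follows from integrating $x^{\varepsilon}$ over [0, 1].

```python
def roc_power(false_positive_rate, eps):
    """Sensitivity p on the curve p = (1 - q)**eps, with eps in [0, 1]."""
    return false_positive_rate ** eps


def auc_power(eps):
    """Area under p = x**eps on [0, 1], i.e. the integral 1 / (1 + eps)."""
    return 1.0 / (1.0 + eps)


# Tabulate a few curves: eps = 1 is the random-guess diagonal and
# eps = 0 is the limiting case in which p = 1 for every q.
for eps in (1.0, 0.5, 0.1, 0.0):
    row = [round(roc_power(x, eps), 3) for x in (0.0, 0.25, 0.5, 1.0)]
    print(f"eps={eps}: p at FPR 0, 0.25, 0.5, 1 -> {row}, AUC={auc_power(eps):.3f}")
```

Smaller values of the parameter push the curve toward the upper-left corner of the ROC space, in line with the assumption that an expanded test lowers the parameter.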



Figure 6. ROC Space diagram for $f(1 - q, \varepsilon) = (1 - q)^{\varepsilon}$ and different values of $\varepsilon$. 

2. Controlling the Classification Setting 

Recall that the classification process is defined by two elements: (1) the test 
performed on each item before it is classified, and (2) the decision mechanism used to 
declare an item as positive. In this section, we discuss the method by which each element 
can be adjusted. 

Assume that the test performed on each item before classification is a subset of 
some global set of evaluation questions, each associated with the average time it takes to 
answer the questions and its corresponding ROC curve, which represents different 
decision mechanisms based on the information accumulated during the test. For example, 
we mentioned three questions with respect to the example on IMINT products described 
in Chapter II: (1) how suitable it is for analysis (e.g., quality of the image and its 
resolution), (2) how strongly the image is related to the IR, and (3) how significant the 
change in the image is since the last time the area was visited. Based on these questions one 
can theoretically create $2^3 = 8$ tests; e.g., only question 1; or only questions 2 and 3, etc. 
However, in reality, four tests might be sufficient since we can order the questions by 
some measure of effectiveness from 1 to 3 and then test i contains the questions with 
rankings 1 through i. Test 0 contains no questions, and test 3 contains all of them. For 
each of those tests the service rate and the ROC curve can be computed empirically, by 
utilizing several decision mechanisms. In our example, test 2, which contains questions 
(1) and (2), may have a two-dimensional threshold mechanism, in which both the quality 
and the relation to the IR should exceed certain reasonable levels. 

Let the empty test be the test in which no question is included and let the complete 
test be the test in which all of the available questions in the global set are included. When 
performing the empty test, no information is collected on the item; therefore, the best 
classification is given by a random guess ROC curve. Naturally, the more questions one 
asks on each item, the better the classification of items as either P or N, since more 
information is gathered to support the decision, and we assume no deliberate 
disinformation. Thus, the value of $\varepsilon$ in the ROC curve does not increase when more 
questions are answered. When additional questions add no more information, the value of 
$\varepsilon$ stays fixed. 

When the service rate at the processing station, $\mu_1$, increases due to reduction of 
the test length, that is, classifiers are instructed to process each item faster by addressing 
fewer questions, $\varepsilon$ gets larger, and the resulting ROC curve “shrinks” south-east, 
towards the random guess diagonal. Thus $\varepsilon(\mu_1)$ is assumed monotone increasing in $\mu_1$. 
In addition, since it is reasonable to assume that relatively ineffective questions, i.e., 
questions that contribute less to the ROC curve of the classification relative to the 
time they consume, are removed first, $\varepsilon(\mu_1)$ is also assumed to be concave. 

The measurement of $\varepsilon(\mu_1)$ requires a thorough discussion. For a predefined test, 
that is, a sequence of questions fully addressed on each item, an experiment should 
include the measurement of both the mean time it takes to perform the test, $1/\mu_1$, and the 
corresponding ROC curve. The measurement of the mean time should refer to 
operationally achieved mean time, e.g., by excluding learning effects. The ROC curve 
can be empirically drawn using different decision mechanisms applied to the same 
information collected during the performance of the test. Given the empirical plot of the 
ROC curve, the best fitted value of $\varepsilon$ should be computed to approximate the retrieved 
curve as the functional form $f(1 - q, \varepsilon)$. Then, given these results for multiple tests, the 
functional form $\varepsilon(\mu_1)$ may be estimated. 

Let $\underline{\mu}_1 > 0$ be the highest value of $\mu_1$ for which the complete test can be 
performed, meaning all possible questions in the global set can be addressed. In that case 
the system reaches the optimal classification capability - the minimal value of $\varepsilon$. Let $\underline{\varepsilon}$, 
$\varepsilon_{\min} \le \underline{\varepsilon} \le \varepsilon_{\max}$, be the classification capability limit, that is $\varepsilon(\underline{\mu}_1) = \underline{\varepsilon}$. On the other hand, 
when $\mu_1$ becomes high enough such that no test can be performed, the classification 
reaches its worst performance - a random guess. Let $\bar{\mu}_1 > \underline{\mu}_1$ be the smallest value of $\mu_1$ 
such that $\varepsilon(\bar{\mu}_1) = \varepsilon_{\max}$. 

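For example, a linear relationship that satisfies the above two conditions is: 

$$ \varepsilon(\mu_1) = \frac{\mu_1 - \underline{\mu}_1}{\bar{\mu}_1 - \underline{\mu}_1} (\varepsilon_{\max} - \underline{\varepsilon}) + \underline{\varepsilon} \qquad (3.5) $$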
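The linear relationship can be checked at its two end points with a short sketch. The function name `eps_linear`, the argument names and the numerical bounds are illustrative assumptions; the default upper value of 1 corresponds to the power-form illustration, in which the random-guess diagonal is reached at a parameter value of 1.

```python
def eps_linear(mu1, mu_lo, mu_hi, eps_lo, eps_max=1.0):
    """Linear quality parameter as a function of the processing rate mu1.

    mu_lo:   highest rate at which the complete test can still be performed
    mu_hi:   smallest rate at which only a random guess is possible
    eps_lo:  classification capability limit, attained at mu_lo
    eps_max: value of the parameter for a random guess, attained at mu_hi
    """
    if not mu_lo <= mu1 <= mu_hi:
        raise ValueError("mu1 must lie within [mu_lo, mu_hi]")
    fraction = (mu1 - mu_lo) / (mu_hi - mu_lo)
    return fraction * (eps_max - eps_lo) + eps_lo


# Made-up bounds: complete test possible up to rate 4, random guess from rate 20.
at_lower = eps_linear(4.0, mu_lo=4.0, mu_hi=20.0, eps_lo=0.2)
at_middle = eps_linear(12.0, mu_lo=4.0, mu_hi=20.0, eps_lo=0.2)
at_upper = eps_linear(20.0, mu_lo=4.0, mu_hi=20.0, eps_lo=0.2)

print(at_lower, at_middle, at_upper)
```

A linear form is only the simplest choice that meets the two boundary conditions; any other monotone increasing function fitted to experimental data could be substituted for `eps_linear`.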

The bounds $\bar{\mu}_1$ and $\underline{\mu}_1$ are estimated by the longest average time in which no test 
other than the empty test can be performed, and by the shortest average time to perform the 
complete test, respectively. The expected service time is $\frac{1}{\mu_1}$. The upper bound $\bar{\mu}_1$ takes 
into account the overhead time per item at the classification station. For example, some 
internal administrative procedures may be required at the processing station regardless of 
the quality of the classification. 

The value of $\underline{\varepsilon}$ is estimated from the ROC curve of the classification when the 
complete test is performed. Note that the region $[\varepsilon_{\min}, \varepsilon_{\max}]$ is the mathematical region for 
which the ROC curve is feasible, while the region $[\underline{\varepsilon}, \varepsilon_{\max}]$ is the achievable region by 
the classification. 

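By substitution of $\varepsilon(\mu_1)$ in (3.3) and (3.4) respectively, and writing $f(1 - q, \mu_1)$ for $f(1 - q, \varepsilon(\mu_1))$, we have: 

$$ p = f(1 - q, \mu_1) \quad \text{for } \mu_1 \in [\underline{\mu}_1, \bar{\mu}_1] \qquad (3.6) $$

$$ \lambda_2 = f(1 - q, \mu_1) \lambda_P + (1 - q) \lambda_N \qquad (3.7) $$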

This formulation uses two sets of variables. The first set comprises the strategic 
parameters of the classification phase: the range of classification work intensities 
$[\underline{\mu}_1, \bar{\mu}_1]$, and the classification capability limit $\underline{\varepsilon}$. These characteristics can be 
controlled: the bounds $\underline{\mu}_1$ and $\bar{\mu}_1$ can be increased by training the classifiers and 
performing the test more efficiently, and the classification capability limit $\underline{\varepsilon}$ can be 
decreased by generating questions with a better ability to classify the items. 

The second set consists of the tactical variables $\mu_1$, p and q, that decide the 
modus operandi in which the processing phase classifies items. These variables give two 
degrees of freedom for policy makers; first, the choice of $\mu_1$ determines the shape of the 
ROC curve by determining the value of $\varepsilon(\mu_1)$. Then the choice of p determines the value 
of q on the ROC curve, and vice versa. Without an analysis station bottleneck, one 
would presumably set $p = 1 - q = 1$ for every value of $\mu_1$; however in the presence of 
limited analysis resources and delay constraints, this setting becomes infeasible. 

In this thesis, we focus on the optimization of the classification by setting the 
value of the tactical decision variables, where the impact of the input variables is further 
explored in the analysis presented in Chapter IV. 

The value of p can be set by relaxing or restricting the decision mechanism for 
declaring an item as positive. To illustrate the adjustment of p and q, suppose the analysts 
are looking for items associated with a certain subject. Consider, for example, a 
one-dimensional test that, given a dictionary of terms, counts the number of term occurrences 
in each item. For that specific test example, the value of $\varepsilon$ depends on the quality of the 
dictionary of terms. The better the dictionary, i.e., the better it differentiates between 
positives and negatives, the lower the value of $\varepsilon$. The decision mechanism is a simple 
threshold on the count. If we require zero occurrences of terms to declare the item 
positive, the classification achieves $p = 1 - q = 1$, which represents a random guess with 
Pr(P) = 1. As we set the threshold higher, the number of items that meet the threshold 
decreases, including both positives and negatives. Therefore, the sensitivity p decreases 
since the false negative rate increases while the specificity q increases since the false 
positive rate decreases. At a certain point we may require more occurrences of terms than 
there exist, resulting in $p = 1 - q = 0$, which is a random guess with Pr(P) = 0. 

Suppose the dictionary contains n terms. If we generalize the test and count the 
number of occurrences for each one of the n terms, the decision mechanism becomes an 
n-dimensional function of the counts. 
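The one-dimensional dictionary test can be sketched as follows. The dictionary, the labeled items and the function names `count_terms` and `roc_point` are all hypothetical and serve only to show how moving the threshold traces points in the ROC space.

```python
def count_terms(text, dictionary):
    """Number of word occurrences in text that belong to the dictionary."""
    return sum(1 for word in text.lower().split() if word in dictionary)


def roc_point(items, dictionary, threshold):
    """Return (1 - q, p) when an item is declared positive if count >= threshold.

    items: list of (text, is_positive) pairs labeled by the analyst.
    """
    tp = fp = positives = negatives = 0
    for text, is_positive in items:
        declared_positive = count_terms(text, dictionary) >= threshold
        if is_positive:
            positives += 1
            tp += int(declared_positive)
        else:
            negatives += 1
            fp += int(declared_positive)
    return fp / negatives, tp / positives


# Hypothetical dictionary and labeled items, for illustration only.
DICTIONARY = {"convoy", "border", "shipment"}
ITEMS = [
    ("convoy crossed the border with a shipment", True),
    ("border patrol reported a convoy", True),
    ("no keyword here but relevant", True),
    ("weather report for the border region", False),
    ("shipment of fruit delayed", False),
    ("local football results", False),
]

for threshold in range(5):
    print(threshold, roc_point(ITEMS, DICTIONARY, threshold))
```

A threshold of zero declares every item positive and a threshold above the largest count declares every item negative, matching the two trivial end points described in the text.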

3. The Alternative of No Classification 

In the case where classification is not implemented at all, the analysis station 
inspects the material in its raw form. We assume the worst-case scenario where the 
inspected items are randomly chosen from the flow of arriving items with a certain 
probability r, according to the analysis station service rate. Thus the system is now a 
single M/M/1 queue where $W = \frac{1}{\mu_2 - r \lambda_1}$, in which $\lambda_1$ is the arrival rate of all items to the 
processing server, both positives and negatives. Therefore, the sensitivity satisfies 
$p_{\min} = r$ and can be expressed as: 

$$ p_{\min} = \frac{\mu_2 - \frac{1}{W}}{\lambda_1} \qquad (3.8) $$

Expression (3.8) represents the lower bound on the sensitivity that may be 
achieved by the system, thus the notation $p_{\min}$. 
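The step from the single-queue delay to the sampling probability can be written out explicitly. The derivation below is a sketch that assumes the M/M/1 delay expression stated above; the numerical values in the last line are made up for illustration.

```latex
\begin{align*}
W &= \frac{1}{\mu_2 - r\lambda_1}
  && \text{(expected delay of the single M/M/1 queue)} \\
\mu_2 - r\lambda_1 &= \frac{1}{W}
  && \text{(invert both sides)} \\
r &= \frac{\mu_2 - \frac{1}{W}}{\lambda_1}
  && \text{(solve for the sampling probability)} \\
r &= \frac{5 - \frac{1}{1}}{10} = 0.4
  && \text{(example with } \mu_2 = 5,\ \lambda_1 = 10,\ W = 1\text{)}
\end{align*}
```

Because items are sampled independently of their type, the fraction of positives that reaches the analyst equals the sampling probability itself.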

E. OBJECTIVES 

As discussed in Chapter II, intelligence products are commonly measured by 
quality, quantity, timeliness, and information needs satisfaction. Since we focus in this 
study only on the part of the intelligence process in which the final product is not yet 
tangible, the quality and information needs satisfaction are hard to estimate. 

Therefore, our measure of effectiveness is the sensitivity p of the system (also 
known as the recall), given an upper bound on the acceptable total expected delay W, 
which affects the timeliness of the intelligence product. 

This concludes the formulation of the framework for the basic model. The 
following sections present the optimization model with respect to the stated objective, 
which we call the "Classification Optimization Model." The classification optimization 
model deals with the optimal setting of the classification at the processing phase, given 
that the arrival rates and the analysis service rate are fixed. In other words, the model 
deals with these questions: what should the service rate $\mu_1$ and the point $(1 - q, p)$ on the 
ROC curve corresponding to $\varepsilon(\mu_1)$ be? In addition, the model allows sensitivity analysis 
for the values of the strategic variables. 

1. Cost-Effectiveness of the System 

When it comes to intelligence processing it is reasonable to choose the sensitivity 
(p) as the objective of the system. However, given the optimal performance of the 
system, one would like to measure its cost-effectiveness as well for decision making 
purposes. As was discussed in Chapter II, it is hard to quantify the value of an 
independent intelligence item in the wider context of the intelligence product. On the 
other hand, using the framework presented in this chapter, we can formulate expectancy 
based measures for the total cost of the coupled system. 

Let the cost of an analyst per unit time be one budget unit, and let b be the 
classifier cost per unit time. Since the best classifier is the analyst, the range of the 
classifier cost is given by $0 < b \le 1$. 

The expected cost of an item declared as negative at the classification phase is $\frac{b}{\mu_1}$, 
while the expected cost of an item declared as positive is $\frac{b}{\mu_1} + \frac{1}{\mu_2}$ since it goes 
through both stations. Therefore, the expected cost of the coupled system is given by 
$b \frac{\lambda_1}{\mu_1} + \frac{\lambda_2}{\mu_2}$. On the other hand, when no classification is performed, the expected cost is 
given by $\frac{p_{\min} \lambda_1}{\mu_2}$, where $p_{\min}$ is given by (3.8), since each one of the $p_{\min} \lambda_1$ processed 
items requires an expected service time of $\frac{1}{\mu_2}$. 

In order to incorporate the performance of the system into the cost measure, we 
look at the expected cost per correctly identified positive. Let B be the expected total cost 
of the system per positive, meaning the cost of both the classification and the analysis 
over a single period divided by the number of correctly identified positives processed 
during that period. For a coupled system the number of correctly identified positives is 
given by $p \lambda_P$ (see 3.1); thus we have 

$$ B = \frac{b \frac{\lambda_1}{\mu_1} + \frac{\lambda_2}{\mu_2}}{p \lambda_P} $$

and for a system without classification we have 

$$ B = \frac{p_{\min} \lambda_1 / \mu_2}{p_{\min} \lambda_P} $$




We use the cost per correctly identified positive, B, to compare the cost- 
effectiveness of classification systems in different scenarios, when the classification 
setting optimizes the sensitivity. 
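The two cost measures can be compared numerically. The sketch below assumes the cost structure described in this section (classifier cost b per unit time, analyst cost of one budget unit per unit time) and that, without classification, the sampling probability cancels out of the ratio; the function names and the scenario values are illustrative assumptions.

```python
def cost_per_positive(lam_p, lam_n, mu1, mu2, p, q, b):
    """Expected cost per correctly identified positive, coupled system."""
    lam1 = lam_p + lam_n                       # items reaching station 1
    lam2 = p * lam_p + (1.0 - q) * lam_n       # items reaching station 2
    total_cost = b * lam1 / mu1 + lam2 / mu2   # classifier cost + analyst cost
    return total_cost / (p * lam_p)


def cost_per_positive_no_classification(lam_p, lam_n, mu2):
    """Expected cost per identified positive when items are sampled at random.

    The sampling probability appears in both the analyst cost and the number
    of identified positives, so it cancels out of the ratio (an assumption
    that follows from the cost structure sketched above).
    """
    return (lam_p + lam_n) / (mu2 * lam_p)


# Made-up scenario with a classifier costing half as much as an analyst.
with_classifier = cost_per_positive(2.0, 8.0, mu1=12.0, mu2=5.0, p=0.9, q=0.8, b=0.5)
without_classifier = cost_per_positive_no_classification(2.0, 8.0, mu2=5.0)

print(with_classifier, without_classifier)
```

In this made-up scenario the coupled system is cheaper per identified positive, because the classifier filters out most negatives before they consume analyst time.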


F. CLASSIFICATION OPTIMIZATION MODEL 

In this section, we formulate the model used to determine the optimal setting of 
the classification at the processing phase, as manifested by the variables $p$, $q$ and $\mu_1$, 
given the strategic parameters and the input parameters $\lambda_P$, $\lambda_N$, $\mu_2$, and the functional 
relationship $\varepsilon(\mu_1)$. 

In the model we maximize the sensitivity p subject to the constraints explained 
previously, namely: (1) stability constraints on both stations in order to assure that each 
station is able to serve the flow of incoming items, (2) requirements that the sensitivity 
and specificity pair lies on the ROC curve, (3) the service rate at station one, $\mu_1$, is 
restricted to be between the two bounds $\underline{\mu}_1$ and $\bar{\mu}_1$, and (4) an upper bound $\bar{W}$ on the 
acceptable total expected delay $W$. Formal definition of the problem is given in (3.9): 

$$ \max_{p, q, \mu_1, W} \; p \qquad (3.9) $$

Subject to: 

• $\mu_j > \lambda_j, \; j = 1, 2$ [Stability constraints] 

• $0 \le p, q \le 1$ [ROC Space constraints] 

• $\mu_1 \in [\underline{\mu}_1, \bar{\mu}_1]$ [Classification rate range] 

• $W \le \bar{W}$ [Delay upper bound] 

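In the subsequent section, we show that the model can be reduced to a
two-dimensional optimization problem in $\mu_1$ and $q$ by expressing the objective as a function
$p = f(1-q, \mu_1)$ following (3.6), substituting $W$ following (3.2), and substituting
$\lambda_2 = f(1-q, \mu_1)\lambda_P + (1-q)\lambda_N$ following (3.7). Therefore the problem is given by:

$$\max_{q,\,\mu_1} \; f(1-q, \mu_1) \tag{3.10}$$

Subject to:

• $\mu_1 > \lambda_1$ [Stability constraint on station 1]

• $\mu_2 > f(1-q, \mu_1)\lambda_P + (1-q)\lambda_N$ [Stability constraint on station 2]

• $0 \le q \le 1$ [ROC Space constraints]

• $\mu_1 \in [\underline{\mu}_1, \bar{\mu}_1]$ [Classification rate range]

• $\frac{1}{\mu_1 - \lambda_1} + \frac{1}{\mu_2 - f(1-q, \mu_1)\lambda_P - (1-q)\lambda_N} \le \bar{W}$ [Delay upper bound]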
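
The constraints of the reduced problem can be checked for any candidate pair $(q, \mu_1)$. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the ROC relationship is passed in as a callable `f(x, mu1)` with `x` the false positive rate $1-q$, and the `random_guess` curve ($p = 1-q$) is used only as a simple stand-in for $f$.

```python
def expected_delay(q, mu1, lam_p, lam_n, mu2, f):
    """Total expected delay of the tandem queue for the setting (q, mu1).

    Returns infinity when either station is unstable.
    """
    lam1 = lam_p + lam_n
    p = f(1.0 - q, mu1)                    # sensitivity on the ROC curve
    lam2 = p * lam_p + (1.0 - q) * lam_n   # flow into the analysis station
    if mu1 <= lam1 or mu2 <= lam2:
        return float("inf")
    return 1.0 / (mu1 - lam1) + 1.0 / (mu2 - lam2)


def is_feasible(q, mu1, lam_p, lam_n, mu2, mu1_min, mu1_max, w_max, f):
    """Check the ROC-space, rate-range, stability and delay constraints."""
    if not (0.0 <= q <= 1.0 and mu1_min <= mu1 <= mu1_max):
        return False
    return expected_delay(q, mu1, lam_p, lam_n, mu2, f) <= w_max


def random_guess(x, mu1):
    """Stand-in ROC curve: sensitivity equals the false positive rate."""
    return x


if __name__ == "__main__":
    # Assumed values: lambda_P = 10, lambda_N = 90, mu_2 = 20, mu_1 in [0, 500].
    for q in (0.90, 0.91):
        print(q, is_feasible(q, 500.0, 10.0, 90.0, 20.0, 0.0, 500.0, 0.1, random_guess))
```

Returning an infinite delay for an unstable station lets the single comparison against the delay bound also enforce both stability constraints.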
IV. ANALYSIS AND RESULTS


A. CLASSIFICATION OPTIMIZATION MODEL ANALYSIS 

Recall the classification optimization model of (3.10):

$$\max_{q,\,\mu_1} \; f(1-q, \mu_1)$$

Subject to:

• $\mu_1 > \lambda_1$ [Stability constraint on station 1]

• $\mu_2 > f(1-q, \mu_1)\lambda_P + (1-q)\lambda_N$ [Stability constraint on station 2]

• $0 \le q \le 1$ [ROC Space constraints]

• $\underline{\mu}_1 \le \mu_1 \le \bar{\mu}_1$ [Classification rate range]

• $\frac{1}{\mu_1 - \lambda_1} + \frac{1}{\mu_2 - f(1-q, \mu_1)\lambda_P - (1-q)\lambda_N} \le \bar{W}$ [Delay upper bound]

Throughout we assume that $f(1-q, \mu_1)$ is continuously differentiable and,
without loss of generality, that $\bar{\mu}_1 > \lambda_1$ because otherwise no classification rate is
feasible. As discussed in Chapter III, $f(1-q, \mu_1)$ is assumed non-decreasing in $1-q$ and
non-increasing in $\mu_1$, with boundary conditions $f(0, \mu_1) = 0$ and
$f(1-q, \bar{\mu}_1) = 1-q$. Observe that the expected delay is non-increasing in $\mu_1$ and
non-decreasing in $1-q$, and that at optimality we must have $\mu_1 > \lambda_1$ and
$\mu_2 > f(1-q, \mu_1)\lambda_P + (1-q)\lambda_N$, for otherwise the delay constraint is violated for any
finite $\bar{W}$. Therefore both of these constraints are slack at optimality.

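To get started, consider the random classification solution, where $\mu_1 = \bar{\mu}_1$ and
$p = 1-q$. For $\mu_2 > (1-q)\lambda_1$, setting $p = 1-q$ is feasible whenever
$(\bar{\mu}_1 - \lambda_1)^{-1} + (\mu_2 - (1-q)\lambda_1)^{-1} \le \bar{W}$, meaning that random classification results in
$p = \min\{\lambda_1^{-1}[\mu_2 - (\bar{W} - (\bar{\mu}_1 - \lambda_1)^{-1})^{-1}], 1\}$ if $(\bar{\mu}_1 - \lambda_1)^{-1} + \mu_2^{-1} \le \bar{W}$. The latter shows that
$p \to \lambda_1^{-1}(\mu_2 - \bar{W}^{-1})$, which is below 1 whenever $\mu_2 < \lambda_1 + \bar{W}^{-1}$, as $\bar{\mu}_1 \to \infty$. Finally,
random classification is unfeasible for $\bar{W} < (\bar{\mu}_1 - \lambda_1)^{-1} + \mu_2^{-1}$ and optimal ($p = 1$)
whenever $\bar{W} \ge (\bar{\mu}_1 - \lambda_1)^{-1} + (\mu_2 - \lambda_1)^{-1}$ and $\mu_2 > \lambda_1$.

For general classification schemes represented by $f(1-q, \mu_1)$, we also have that
$\bar{W} < \mu_2^{-1} + (\bar{\mu}_1 - \lambda_1)^{-1}$ results in an unfeasible classification optimization problem, with
$\mu_1 = \bar{\mu}_1$, $q = 1$ the only feasible solution for $\bar{W} = (\bar{\mu}_1 - \lambda_1)^{-1} + \mu_2^{-1}$.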
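
The closed form for random classification can be computed as follows. This Python sketch is illustrative: the function name is hypothetical, the formula is the one obtained by making the delay constraint tight with $p = 1-q$ and $\mu_1 = \bar{\mu}_1$, and returning `None` for an unfeasible delay bound is a convention chosen here.

```python
def random_classification_sensitivity(lam1, mu1_max, mu2, w_max):
    """Sensitivity achieved by random classification (p = 1 - q, mu1 = mu1_max).

    Returns None when the delay bound is below the minimum achievable delay,
    1/(mu1_max - lam1) + 1/mu2, in which case the problem is unfeasible.
    """
    if mu1_max <= lam1:
        return None                        # classification station cannot be stable
    delay_station1 = 1.0 / (mu1_max - lam1)
    if w_max < delay_station1 + 1.0 / mu2:
        return None                        # delay bound cannot be met even with p = 0
    slack = w_max - delay_station1         # delay budget left for the analysis station
    p = (mu2 - 1.0 / slack) / lam1         # tight delay constraint solved for p
    return min(p, 1.0)


if __name__ == "__main__":
    # Assumed values: lambda_1 = 100, upper classification rate 500, mu_2 = 20.
    for w_max in (0.05, 0.1, 5.0):
        print(w_max, random_classification_sensitivity(100.0, 500.0, 20.0, w_max))
```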

Let us now consider the remaining range of delay constraints,
$(\bar{\mu}_1 - \lambda_1)^{-1} + \mu_2^{-1} < \bar{W}$, where the problem is feasible. The standard theory suggests that
the Karush-Kuhn-Tucker (KKT) conditions are necessary with, as discussed earlier, the
delay constraint tight at optimality. It is possible, however, that the classification range
constraints are tight as well. We denote $\nabla_x f = \partial f / \partial x$. Hence, the necessary conditions yield
three candidate solutions:

• A stationary point with $\underline{\mu}_1 < \mu_1 < \bar{\mu}_1$ and KKT multiplier $\alpha$:

$$\nabla_{1-q} f(1-q, \mu_1) = \alpha \nabla_{1-q}\left[\frac{1}{\mu_1 - \lambda_1} + \frac{1}{\mu_2 - \lambda_2}\right] \tag{4.2a}$$

$$\nabla_{\mu_1} f(1-q, \mu_1) = \alpha \nabla_{\mu_1}\left[\frac{1}{\mu_1 - \lambda_1} + \frac{1}{\mu_2 - \lambda_2}\right] \tag{4.2b}$$

$$\frac{1}{\mu_1 - \lambda_1} + \frac{1}{\mu_2 - \lambda_2} = \bar{W} \tag{4.2c}$$

where $\lambda_2 = f(1-q, \mu_1)\lambda_P + (1-q)\lambda_N$.

• The random classification solution discussed earlier,
$p = \min\{\lambda_1^{-1}[\mu_2 - (\bar{W} - (\bar{\mu}_1 - \lambda_1)^{-1})^{-1}], 1\}$, where $\mu_1 = \bar{\mu}_1$.

• The classification range constraint is tight at the lower bound, $\mu_1 = \underline{\mu}_1$, with
$p = f(1-q, \underline{\mu}_1)$, where $0 \le q \le 1$ is as small as possible subject to making the
delay constraint tight. We should point out that this solution cannot be optimal for
$\underline{\mu}_1 < \lambda_1 + (\bar{W} - \mu_2^{-1})^{-1}$ because in that range the delay constraint is not tight for any
feasible $q$.

From these observations, we gather that random classification tends to perform
very well when $\bar{W}$ and $\mu_2$ are relatively large. Unfortunately, it is difficult to obtain
other qualitative insights without making stronger assumptions about $f(1-q, \mu_1)$. This is
precisely the approach we take in the next section, where we implement the model for a
particular function $f(1-q, \mu_1)$.

B. CLASSIFICATION OPTIMIZATION MODEL: NUMERICAL EXAMPLE 

This section presents numerical results from implementing the classification 
optimization model on various scenarios. We study the effect of the main parameters on 
the performance of the system, highlighting key insights that can be drawn both from the 
change in the optimal value and the arguments that achieve this optimal value. 

The implementation of the model is done using the General Algebraic Modeling
System (GAMS), which provides a high-level language for the purpose of implementing
mathematical programming models (Brooke et al., 1998).

1. The Model

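Recall from Chapter III that the ROC curve considered here is determined by two
functional relations: $p = f(1-q, \varepsilon)$ for some parameter $\varepsilon$ (see 3.3), and
the relation between the parameter $\varepsilon$ and the classification rate $\mu_1$: $\varepsilon(\mu_1) = \varepsilon$.

In this implementation of the model we assume that $f(1-q, \varepsilon) = (1-q)^{\varepsilon}$ where
$\varepsilon \in [0,1]$ (see Figure 6 for illustration) and that
$\varepsilon(\mu_1) = \underline{\varepsilon} + (1 - \underline{\varepsilon})\frac{\mu_1 - \underline{\mu}_1}{\bar{\mu}_1 - \underline{\mu}_1}$ (see 3.5 for
derivation), where $\underline{\varepsilon}$ is the classification capability limit, the lowest achievable value of
the parameter $\varepsilon$.

Let

$$a = \frac{1 - \underline{\varepsilon}}{\bar{\mu}_1 - \underline{\mu}_1}, \qquad b = \frac{\underline{\varepsilon}\,\bar{\mu}_1 - \underline{\mu}_1}{\bar{\mu}_1 - \underline{\mu}_1},$$

then $\varepsilon(\mu_1)$ can be written as $\varepsilon(\mu_1) = a\mu_1 + b$, and $f(1-q, \mu_1) = (1-q)^{a\mu_1 + b}$.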


2. Input Parameters 

Recall that the input parameters to the models are the arrival rates, $\lambda_P$ and $\lambda_N$, of
the positives and negatives, respectively, and the characteristics of the system, which
include the analysis service rate $\mu_2$, the classification rate range $[\underline{\mu}_1, \bar{\mu}_1]$, the acceptable
delay $\bar{W}$, and the classification capability limit, $\underline{\varepsilon}$.

We consider three scenarios with respect to the arrival rates, all of which
satisfy $\lambda_1 = \lambda_P + \lambda_N = 100$. A low value source has $\lambda_P = 1$, a medium value source has
$\lambda_P = 10$ and a high value source has $\lambda_P = 20$. The analysis service rate is assumed to be
at a constant level of $\mu_2 = 20$ for all three scenarios and the range of the classification
rate is $[0, 500]$. The lower bound of zero represents an unlimited number of answers that
can be addressed with respect to each item, and the upper bound of 500 represents the
average number of items that can be passed on to the analysis phase when no
classification is made at all. We let the classification capability limit assume its best
value, $\underline{\varepsilon} = 0$, thus $\varepsilon(\mu_1) = a\mu_1 + b = 0.002\mu_1$. Figure 7 shows the value of the function
$f(1-q, \mu_1)$ for the example above as a function of its variables $1-q$ and $\mu_1$. We can
notice that for $\mu_1 = \bar{\mu}_1$ we get the linear random guess and for $\mu_1 = \underline{\mu}_1$ we get a steep
curve reaching perfect classification.

[Figure 7 plot: "Basic Setup Unconstrained Sensitivity Plot"; axis: Classification Service Rate ($\mu_1$)]

Figure 7. $f(1-q, \mu_1)$ as a function of its variables $1-q$ and $\mu_1$ for
the described scenario

Since the lower bound on the delay that is developed in the previous section,
$(\bar{\mu}_1 - \lambda_1)^{-1} + \mu_2^{-1} \le \bar{W}$, does not depend on the values of $\lambda_P$ and $\lambda_N$ independently, but on
the sum of the two, $\lambda_1$, which stays constant, all three scenarios have the same feasible
range of $\bar{W}$ with respect to the input parameters, and it can be shown that it is:

$$\frac{1}{\bar{\mu}_1 - \lambda_1} + \frac{1}{\mu_2} = 0.0525 \le \bar{W}.$$
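
The arithmetic behind this bound can be checked with a few lines of Python. The sketch below is illustrative (the function name is hypothetical); it uses the scenario values $\bar{\mu}_1 = 500$, $\mu_2 = 20$ and $\lambda_1 = 100$, and shows that the bound is the same for all three source values.

```python
def minimum_expected_delay(lam_p, lam_n, mu1_max, mu2):
    """Smallest achievable expected delay: fastest classification, no load on analysis."""
    lam1 = lam_p + lam_n                   # only the total arrival rate matters
    return 1.0 / (mu1_max - lam1) + 1.0 / mu2


if __name__ == "__main__":
    for lam_p in (1.0, 10.0, 20.0):        # low, medium and high value sources
        lam_n = 100.0 - lam_p
        print(lam_p, round(minimum_expected_delay(lam_p, lam_n, 500.0, 20.0), 4))
```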

Suppose that the acceptable delay assumes one of two levels, where the level is
determined by the nature of the IR. A tactical IR, such as during tactical engagements on
the battlefield or “ticking time bomb” scenarios, has $\bar{W} = 0.1$, which is close to the lower
bound on the delay, while a strategic IR, such as long term armament transactions, has
$\bar{W} = 5$, which means a delay of up to 5 time periods.

For the sake of brevity we name the six scenarios as follows: (i) tactical-low
($\bar{W} = 0.1, \lambda_P = 1$), (ii) tactical-medium ($\bar{W} = 0.1, \lambda_P = 10$), (iii) tactical-high
($\bar{W} = 0.1, \lambda_P = 20$), (iv) strategic-low ($\bar{W} = 5, \lambda_P = 1$), (v) strategic-medium
($\bar{W} = 5, \lambda_P = 10$) and (vi) strategic-high ($\bar{W} = 5, \lambda_P = 20$).
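
For readers without access to GAMS, the scenarios can be explored with a coarse grid search. The Python sketch below is not the thesis implementation and is not expected to reproduce its optimal values exactly: the function names, the grid resolution, and the use of exhaustive search in place of a nonlinear solver are all assumptions made here. It assumes the ROC form $f(1-q,\mu_1) = (1-q)^{a\mu_1+b}$ with the exponent linear in $\mu_1$ between the capability limit (at the lowest rate) and 1 (at the highest rate).

```python
def roc_sensitivity(x, mu1, eps_limit, mu1_min, mu1_max):
    """Assumed ROC curve: p = x ** (a*mu1 + b), x being the false positive rate 1 - q."""
    a = (1.0 - eps_limit) / (mu1_max - mu1_min)
    b = (eps_limit * mu1_max - mu1_min) / (mu1_max - mu1_min)
    return x ** (a * mu1 + b)


def grid_search(lam_p, lam_n, mu2, w_max, eps_limit=0.0,
                mu1_min=0.0, mu1_max=500.0, q_steps=100, mu1_steps=500):
    """Return (p, q, mu1, delay) maximizing p over a grid, or None if nothing is feasible."""
    lam1 = lam_p + lam_n
    best = None
    for i in range(mu1_steps + 1):
        mu1 = mu1_min + (mu1_max - mu1_min) * i / mu1_steps
        if mu1 <= lam1:                       # station 1 would be unstable
            continue
        for k in range(q_steps + 1):
            x = k / q_steps                   # false positive rate 1 - q
            p = roc_sensitivity(x, mu1, eps_limit, mu1_min, mu1_max)
            lam2 = p * lam_p + x * lam_n      # flow into the analysis station
            if mu2 <= lam2:                   # station 2 would be unstable
                continue
            delay = 1.0 / (mu1 - lam1) + 1.0 / (mu2 - lam2)
            if delay <= w_max and (best is None or p > best[0]):
                best = (p, 1.0 - x, mu1, delay)
    return best


if __name__ == "__main__":
    scenarios = {"tactical": 0.1, "strategic": 5.0}
    for name, w_max in scenarios.items():
        for lam_p in (1.0, 10.0, 20.0):
            print(name, lam_p, grid_search(lam_p, 100.0 - lam_p, 20.0, w_max))
```

A finer grid, or a local nonlinear solver started from the best grid point, would tighten the result; the grid search is used here only because it needs no external packages.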


3. Service Rate at the Analysis Station ($\mu_2$)


The service rate, $\mu_2$, at the analysis station is the bottleneck to which the
classification adjusts its performance. For the discussed scenario, it can be shown that,
given the value of $\bar{W}$, the feasibility conditions discussed in Chapter IV.A restrict the
value of $\mu_2$ to

$$\left(\bar{W} - \frac{1}{\bar{\mu}_1 - \lambda_1}\right)^{-1} = 10.26 \le \mu_2 \le 110 = \lambda_1 + \frac{1}{\bar{W}}$$

in the tactical scenario, and to

$$\left(\bar{W} - \frac{1}{\bar{\mu}_1 - \lambda_1}\right)^{-1} = 0.2 \le \mu_2 \le 100.2 = \lambda_1 + \frac{1}{\bar{W}}$$

in the strategic scenario.
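
The two ranges can be reproduced numerically. The following Python sketch is illustrative: the function name is hypothetical, and the bound expressions are those implied by the feasibility conditions of Chapter IV.A together with the values $\lambda_1 = 100$ and an upper classification rate of 500.

```python
def analysis_rate_range(lam1, mu1_max, w_max):
    """Range of the analysis service rate mu_2 considered for a given delay bound.

    lower: smallest mu_2 for which the minimum delay 1/(mu1_max - lam1) + 1/mu_2
           still meets w_max.
    upper: lam1 + 1/w_max, the rate at which analysts alone could process every
           item within the delay bound.
    """
    slack = w_max - 1.0 / (mu1_max - lam1)
    if slack <= 0.0:
        raise ValueError("delay bound is below the classification delay alone")
    return 1.0 / slack, lam1 + 1.0 / w_max


if __name__ == "__main__":
    for name, w_max in (("tactical", 0.1), ("strategic", 5.0)):
        lower, upper = analysis_rate_range(100.0, 500.0, w_max)
        print(name, round(lower, 2), round(upper, 2))
```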




The graph below (Figure 8) shows the effects of the value of $\mu_2$ in its feasible
range under the tactical IR for the different scenarios, and for the alternative of no
classification (see (3.8)). As expected, the returns from an increased analysis rate
diminish, and when the analysis rate reaches the upper bound, the sensitivity is nearly
perfect since the analysis is no longer a bottleneck, thus almost all items are analyzed.

When costs are not taken into account, naturally the optimal sensitivity $p^*$, which
is obtained from solving the model (3.10), is always better than in the case of no
classification, in which we denote the sensitivity by $p_{\min}$. Exploring the ratio $p^*/p_{\min}$,
presented in Figure 9, allows us to evaluate the effect of classification for a given
scenario.

[Figure 8 plot: "Optimal Sensitivity vs Analysis Service Rate"]

Figure 8. Optimal sensitivity vs. analysis service rate ($\mu_2$)

[Figure 9 panels: "Sensitivity Improvement vs. Analysis Service Rate (Tactical)" and
"Sensitivity Improvement vs. Analysis Service Rate (Strategic)"]

Figure 9. Sensitivity improvement vs. the analysis service rate ($\mu_2$)
for the tactical (left) and strategic (right) scenario.

Operationally, this reinforces the intuition that classification is most effective
when (1) the analysis poses a significant bottleneck, because its rate is significantly lower
than the incoming rate of items, (2) the timeliness is not a crucial factor, and (3) the
source has a low value in the sense that the ratio $\lambda_P/\lambda_1 \ll 1$.


4. Source Quality


The quality of the source is represented by the ratio $\lambda_P/\lambda_1$. To eliminate the effect of
quantity, meaning the effect of an increase in $\lambda_1$, on the performance of the system, we
iterate over the value of $\lambda_P$ while keeping the sum $\lambda_1$ constant at 100. Recall that in the
basic setup $\mu_2 = 20$ in any scenario, therefore if it was not for the classification, from
(3.8) it would have followed that the minimal sensitivity is
$p_{\min} = \frac{\mu_2 - 1/\bar{W}}{\lambda_1} = 0.1$ for the tactical IR and
$p_{\min} = \frac{\mu_2 - 1/\bar{W}}{\lambda_1} = 0.198$ for the strategic IR, regardless of the value of $\lambda_P$.
Figure 10 presents the sensitivity improvement $p^*/p_{\min}$ versus $\lambda_P$ in the range $[1, 100]$ for
the tactical and strategic scenarios. Naturally, the higher the rate of positives, $\lambda_P$, the
lower the sensitivity improvement achieved by the classification.
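
The baseline values quoted above follow from a one-line computation. The Python sketch below is illustrative: it assumes that (3.8) takes the form $p_{\min} = (\mu_2 - 1/\bar{W})/\lambda_1$, which reproduces the values 0.1 and 0.198; the function names, the clamping to $[0, 1]$, and the sample optimal sensitivity passed to `improvement_factor` are assumptions made here.

```python
def sensitivity_without_classification(mu2, lam1, w_max):
    """Assumed form of (3.8): fraction of items the analysts can process within w_max."""
    p_min = (mu2 - 1.0 / w_max) / lam1
    return max(0.0, min(1.0, p_min))       # clamp to a valid probability (assumption)


def improvement_factor(p_opt, mu2, lam1, w_max):
    """Ratio of an optimal sensitivity p_opt to the no-classification baseline."""
    return p_opt / sensitivity_without_classification(mu2, lam1, w_max)


if __name__ == "__main__":
    print(sensitivity_without_classification(20.0, 100.0, 0.1))   # tactical IR
    print(sensitivity_without_classification(20.0, 100.0, 5.0))   # strategic IR
    # p_opt = 0.5 is a hypothetical value, used only to show how the ratio is formed.
    print(improvement_factor(0.5, 20.0, 100.0, 0.1))
```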

The scenario in which the quality of a certain source varies over time is very
reasonable in reality. When a source is first processed, it may bear large quantities of new
information, namely positive, and at the extreme even $\lambda_P = \lambda_1$ is a possible scenario.
However, the more this source is exploited, the less valuable it becomes due to repetitions
of already known information. In the long run, this means that the source may stabilize
on a certain rate of positives, which relates to the rate at which new information appears
in that source.

Figure 10. Sensitivity improvement factor versus source quality
($\lambda_P$, where $\lambda_1 = 100$) under tactical and strategic IRs


The graph above reveals an interesting behavior: for low value sources,
classification in the tactical scenario is more beneficial than in the strategic one.
Nevertheless, around $\lambda_P = \mu_2$ the improvement curves cross and the classification in the
strategic scenario is more beneficial than the one in the tactical.

Another insight can be drawn from the behavior of the decision variables $1-q$
and $\mu_1$ at optimality. Recall that both represent the setting of the classification. Figure
11 shows the normalized classification rate, computed as $\mu_1/\lambda_1$, and the false positive rate
$1-q$ for the different values of $\lambda_P$, when $\lambda_1 = 100$.



[Figure 11 panels: "Optimal Classification Setting vs. Source Quality (Tactical)" and
"Optimal Classification Setting vs. Source Quality (Strategic)"]

Figure 11. Classification setting (normalized classification rate, $\mu_1/\lambda_1$, and false positive
rate, $1-q$) versus $\lambda_P$, where $\lambda_1 = 100$, under tactical IR (left) and strategic IR
(right)


Comparing the graphs in Figure 11 reveals a different behavior at optimality
between the two scenarios, tactical and strategic, as the source quality improves. In
both scenarios, an increase in the quality of the source reduces the false positive rate at
first. However, under the tactical scenario, the higher the quality of the source, the lower
the quality of the classification, as manifested by the increase in the classification rate.
On the other hand, under the strategic scenario, the quality of the classification is rather
constant near its highest feasible value, while the threshold mechanism is relaxed so the
false positive rate $1-q$ decreases. This difference shows the value of time under the
different scenarios: in the tactical scenario the optimal setting reduces the length of
the test because of the urgency of submitting the item, whereas in the strategic scenario
the preference is to keep the test as accurate as possible, despite the time required to do so.

5. Classification Capability Limit ($\underline{\varepsilon}$)

The classification capability limit, $\underline{\varepsilon}$, represents the ability of the classifier to
address the questions correctly given an unlimited amount of time. This factor takes into
account the experience and the training of the classifier: if the limit is at its minimum
possible value ($\underline{\varepsilon} = 0$) then the classifier is as good as the analyst, and if it is at its
maximum ($\underline{\varepsilon} = 1$), the classifier is as good as a random guess.

Figure 12 reveals the impact of this parameter on the performance of the system
as manifested by the degradation in the sensitivity when compared to a fully trained
classifier, $p^*(\underline{\varepsilon}) / p^*(\underline{\varepsilon} = 0)$. For the same capability, the system performs better when
classifying low value sources under strategic IR. Counter-intuitively, we get that the
degradation is lower in the case where a high value source is classified.

Figure 12. Sensitivity degradation vs. classification capability limit under tactical IR

C. COST-EFFECTIVENESS ANALYSIS 

Figure 13 compares the expected cost per correctly identified positive item in a
tactical scenario (see Chapter III Section E.1) for various values of costs as a function
of the source quality.

[Figure 13 plot: "Cost per Correctly Identified Positive vs. Source Quality (Tactical)";
axis: Source Quality ($\lambda_P$) for $\lambda_1 = 100$]

Figure 13. Cost per correctly identified positive ($B$) vs. the source quality under
tactical IR, without classification and with classification with classifier cost
$b = 0.1, 0.5, 1$.


Since $B \propto 1/\lambda_P$, as the source quality $\lambda_P$ increases, the cost per correctly identified
positive decreases in all studied scenarios. However, the graph shows that the higher the
quality of the source, the more economical it is to abandon the classification and allow
analysts direct access to the items. As the quality of the source decreases, the
implementation of a preliminary classification station becomes the more cost-effective
option, where the point of equality between the two options depends naturally on
the cost of the classifier, $b$.

The operational meaning of this result is that when a source is first processed, it
should be processed by the analysts until the rate of new information decreases below a
certain level, which depends on the cost of the classifiers.

Since $\mu_2$ and $\lambda_P$ have an identical role in the expression for the cost per correctly
identified positive item when no classification is implemented ($B = \frac{\lambda_1}{\mu_2\lambda_P}$), and in the
second term when classification is implemented
($B = \frac{b\,\lambda_1}{p\,\lambda_P\mu_1} + \frac{\lambda_2}{p\,\lambda_P\mu_2}$), similar behavior
appears when we increase the analysis service rate, $\mu_2$.


Another parameter which is of interest to explore in the context of
cost-effectiveness is the classification capability limit, $\underline{\varepsilon}$. Assume that we are given several
alternatives for the classifier position, each associated with its classification capability
limit $\underline{\varepsilon}$ and cost $b$. Naturally the lower the classification capability limit, the higher the
cost, until the point at which we are using an analyst equivalent classifier with $\underline{\varepsilon} = 0$ and
$b = 1$. For a given scenario, such as the one described in the beginning of this section, we
can make comparisons among the alternatives using their cost effectiveness or with
respect to the alternative of no classification based on Figure 14.


[Figure 14 plot: "Cost per Correctly Identified Positive vs. Classification Capability
Limit (Strategic-High)"]

Figure 14. Cost per correctly identified positive vs. classification capability limit
under a strategic IR with high value source.

Since the classification capability limit is a product of the training that the
classifier receives and his accumulated experience, it can be determined to some extent
when strategically planning the system. In Chapter V, we further discuss a possible
extension to the classification model that will optimize this training under a given budget
constraint.

V. CONCLUSIONS AND FUTURE WORK


A. CLASSIFICATION: SETTING AND MEASURES 

The main contribution of this thesis is the formulation and analysis of a 
mathematical model for evaluating and optimizing the classification process in the 
intelligence cycle. The mathematical model, which is an optimization model based on a 
tandem queue, is used to determine tactical parameters in the classification process, while 
accounting for strategic parameters and studying the sensitivity of the optimal 
performance to those parameters. The key features are the ROC curve and its sensitivity 
to strategic parameters and the relations between the speed of the classification and its 
accuracy. 

The model optimizes the sensitivity of the system using the classification setting 
parameters in two degrees of freedom: the length of time spent on each item’s 
classification (service time) which defines the shape of the ROC curve of the 
classification process, and the implemented decision mechanism which determines the 
specific point on that ROC curve. 

The effectiveness of the classification, as manifested by the sensitivity achieved at 
optimality, is measured with respect to the sensitivity of the system when no 
classification is implemented at all. This measure allows us not only to quantify the
benefit of the classification under a given scenario, but also to compare the added value
among different scenarios, yielding rules of thumb for better classification resource
allocation.

B. SENSITIVITY ANALYSIS 

In the classification optimization model, the tactical parameters are adjusted to 
achieve optimality in terms of the sensitivity of the system. In the implementation of the 
model discussed in Chapter IV Section B, two scenario-related parameters are studied: 
the source quality $\lambda_P$ and the expected delay constraint $\bar{W}$. We have shown that the higher
the source quality (with respect to a fixed arrival rate $\lambda_1$), the less beneficial it is to
implement the classification. In addition, for low quality sources, better improvement is
achieved for IRs that require immediate action, while for high quality sources the
opposite applies: IRs that allow longer response time benefit more from the classification.

Another operational implication of the model is inferred from the different 
behavior at optimality between the tactical and strategic scenarios. As the source quality 
increases, in both scenarios the false positive rate should be minimized at first, in order to 
assure that the time share that analysts work on true positives is maximized. 
Nevertheless, at optimality, an increase in source quality results in a decrease in the 
optimal length of the test under a tactical scenario, while the test length stays rather 
constant in the strategic one. 

Two main strategic parameters of the system are explored: the analysis service
rate ($\mu_2$) and the classification capability limit ($\underline{\varepsilon}$). The improvement achieved by
implementing the optimal classification setting decreases as the analysis service rate
increases and increases as the classification capability limit decreases. In addition, a
cost-effectiveness study of the relationship between the classifier cost and the classification
capability limit is developed as a means for comparing different classifiers.

It is shown that when a source is highly valuable, it is more cost-effective to allow 
the analysts to directly interact with the source, despite the limited resources. Given the 
cost of the classification, the breakeven source quality in which both alternatives bear the 
same cost can be estimated. 

C. FUTURE WORK 

The model presented in this thesis forms a basic framework that can be extended 
in several directions that we believe are relevant and beneficial to the intelligence 
community. In this section, we discuss these directions briefly. 

1. Partial Testing and Analysis 

The model discussed in Chapter III assumed that a test, namely a set of questions 
to be addressed regarding a single item, must be fully completed before deeming it as 
positive or negative. A similar assumption is made concerning the bottleneck’s
processing, namely the analysis. This assumption is helpful when formulating a model
that crystallizes the first order effects of the parameters. Nevertheless, a generalized 
model that allows different service time distributions for positive and negative items may 
result in a more precise description of reality and may be a step towards a predictive 
model. 


2. System Optimization Models 

The thesis’ focus is to optimize the classification settings, namely the tactical 
parameters, in a given scenario and to compare the effectiveness among different 
scenarios. Although resources are not always interchangeable between the two phases - 
processing and analysis - due to organizational constraints, as discussed in Chapter II, in 
some cases resources may be shifted from one station to the other. In such a situation the 
need for system optimization models—models that optimize the strategic parameters in 
addition to the tactical ones—arises. We give here two variants of the system 
optimization model as examples for such models. 

In the first example, the model is used to allocate specialists between the two 
phases given a constraint on the total number of specialists. In this case the model 
becomes a tandem M/M/k queue and the optimization is both on the tactical parameters 
and the number of servers in each station. 

In the second model, the cost per time unit of a classifier is functionally related to
the classification capability limit, where higher costs are associated with a lower
classification capability limit, that is, with a classifier that can come closer to perfect
classification. In this model, an additional constraint is added to limit the cost of the system.
The explored tradeoff in this case would be between the classifier’s quality and the time
spent on classifying each item.

3. Bias: Deception and Misconception 

A discussion about intelligence cannot be complete without discussing the effects 
of bias that can emerge from deception that is the result of an active counter-intelligence 
effort, or misconception, that is, erroneously adopted beliefs by senior intelligence
experts and decision makers. In the presented model, the notion of positives and
negatives is defined by the analysts’ perception and instructions. Such an assumption 
aligns with user satisfaction measures, since the analysts are the processing phase’s user, 
whose satisfaction is measured by the sensitivity of the classification to the given 
instructions, regardless of their absolute value. 

Nevertheless, it may be of interest to incorporate the bias as a parameter of the 
model, and quantify its effects on the absolute value of the intelligence production 
process, as viewed by the opponent. This sort of model requires a model for the process 
in which the analyst uncovers the true nature of each item by a process of exploration. At 
any given time, the analyst has to divide his time between exploration and exploitation. 
Exploration means that items are processed even if they are declared as negatives, in 
order to find clues for unknown positives. On the other hand, when exploiting, only items 
deemed as positives are processed.